
    Relation Networks for Object Detection

    Although it has long been believed that modeling relations between objects would help object recognition, there has been no evidence that the idea works in the deep learning era. All state-of-the-art object detection systems still rely on recognizing object instances individually, without exploiting their relations during learning. This work proposes an object relation module. It processes a set of objects simultaneously through interaction between their appearance features and geometry, thus allowing modeling of their relations. It is lightweight and in-place: it requires no additional supervision and is easy to embed in existing networks. It is shown to be effective at improving the object recognition and duplicate removal steps in the modern object detection pipeline, verifying the efficacy of modeling object relations in CNN-based detection. It also gives rise to the first fully end-to-end object detector.
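
    The relation module is described only at a high level here; the snippet below is a hedged PyTorch sketch of an attention-style module that combines appearance similarity with a non-negative geometric weight. Class, dimension, and argument names are illustrative assumptions, not the paper's released code.

    ```python
    # Minimal sketch: attention over per-object appearance features with a learned
    # geometric bias; names and dimensions are illustrative, not the official code.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RelationModule(nn.Module):
        def __init__(self, feat_dim=1024, key_dim=64, geo_dim=64):
            super().__init__()
            self.q = nn.Linear(feat_dim, key_dim)    # query projection
            self.k = nn.Linear(feat_dim, key_dim)    # key projection
            self.v = nn.Linear(feat_dim, feat_dim)   # value projection
            self.geo = nn.Linear(geo_dim, 1)         # scalar weight from embedded pairwise box geometry
            self.scale = key_dim ** 0.5

        def forward(self, appearance, geometry_embed):
            # appearance:     (N, feat_dim)   per-object appearance features
            # geometry_embed: (N, N, geo_dim) embedded pairwise box geometry
            attn = self.q(appearance) @ self.k(appearance).t() / self.scale  # (N, N) appearance weight
            geo_w = F.relu(self.geo(geometry_embed)).squeeze(-1)             # (N, N) non-negative geometric weight
            w = F.softmax(attn + torch.log(geo_w.clamp(min=1e-6)), dim=-1)   # normalize over the other objects
            return appearance + w @ self.v(appearance)                       # residual relation feature
    ```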

    Multi-view PointNet for 3D Scene Understanding

    Fusion of 2D images and 3D point clouds is important because information from dense images can enhance sparse point clouds. However, fusion is challenging because 2D and 3D data live in different spaces. In this work, we propose MVPNet (Multi-View PointNet), which aggregates 2D multi-view image features into 3D point clouds and then uses a point-based network to fuse the features in 3D canonical space to predict 3D semantic labels. To this end, we introduce view selection along with a 2D-3D feature aggregation module. Extensive experiments show the benefit of leveraging features from dense images and reveal superior robustness to varying point cloud density compared to 3D-only methods. On the ScanNetV2 benchmark, our MVPNet significantly outperforms prior point cloud based approaches on the task of 3D semantic segmentation. It is much faster to train than the large networks of the sparse voxel approach. We provide solid ablation studies to ease the future design of 2D-3D fusion methods and their extension to other tasks, as we showcase for 3D instance segmentation. Comment: Geometry Meets Deep Learning Workshop, ICCV 2019
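
    The abstract only names the 2D-3D feature aggregation module; the sketch below shows one plausible form of such aggregation (inverse-distance k-nearest-neighbour interpolation of unprojected image features onto the point cloud). It is an assumption for illustration, not the released MVPNet code.

    ```python
    # Hedged sketch: gather multi-view image features, already unprojected to 3D,
    # onto a target point cloud via inverse-distance k-NN interpolation.
    import torch

    def aggregate_2d_to_3d(points, unprojected_xyz, unprojected_feats, k=3):
        # points:            (N, 3)  target point cloud
        # unprojected_xyz:   (M, 3)  3D positions of unprojected multi-view pixels
        # unprojected_feats: (M, C)  image features at those pixels
        dist = torch.cdist(points, unprojected_xyz)      # (N, M) pairwise distances
        d, idx = dist.topk(k, dim=1, largest=False)      # k nearest unprojected pixels per point
        w = 1.0 / (d + 1e-8)
        w = w / w.sum(dim=1, keepdim=True)               # inverse-distance weights
        neigh = unprojected_feats[idx]                   # (N, k, C) neighbouring image features
        return (w.unsqueeze(-1) * neigh).sum(dim=1)      # (N, C) fused 2D features per 3D point
    ```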

    The Value of Autofluorescence Bronchoscopy Combined with White Light Bronchoscopy Compared with White Light Alone in the Diagnosis of Intraepithelial Neoplasia and Invasive Lung Cancer: A Meta-Analysis

    Objective: To compare the accuracy of autofluorescence bronchoscopy (AFB) combined with white light bronchoscopy (WLB) versus WLB alone in the diagnosis of lung cancer. Methods: The Ovid, PubMed, and Google Scholar databases were searched from January 1990 to October 2010. Two reviewers independently assessed the quality of the trials and extracted data. The relative risks for sensitivity and specificity on a per-lesion basis of AFB + WLB versus WLB alone to detect intraepithelial neoplasia and invasive cancer were pooled using Review Manager. Results: Twenty-one studies involving 3266 patients were ultimately analyzed. The pooled relative sensitivity on a per-lesion basis of AFB + WLB versus WLB alone to detect intraepithelial neoplasia and invasive cancer was 2.04 (95% confidence interval [CI] 1.72–2.42) and 1.15 (95% CI 1.05–1.26), respectively. The pooled relative specificity on a per-lesion basis of AFB + WLB versus WLB alone was 0.65 (95% CI 0.59–0.73). Conclusions: Although the specificity of AFB + WLB is lower than that of WLB alone, AFB + WLB appears to significantly improve the sensitivity for detecting intraepithelial neoplasia. However, this advantage over WLB alone appears much smaller for detecting invasive lung cancer.
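
    For readers unfamiliar with the relative-sensitivity figure, the toy calculation below shows how a per-lesion risk ratio of detection is formed for a single study. The counts are invented purely for illustration and are not data from this meta-analysis.

    ```python
    # Illustrative only: per-lesion relative sensitivity (risk ratio of detection)
    # for one hypothetical study; these counts are NOT from the meta-analysis.
    def relative_sensitivity(tp_combined, lesions_combined, tp_wlb, lesions_wlb):
        """sensitivity(AFB + WLB) divided by sensitivity(WLB alone)."""
        return (tp_combined / lesions_combined) / (tp_wlb / lesions_wlb)

    # Hypothetical study: 45 of 60 intraepithelial lesions detected with AFB + WLB
    # versus 22 of 60 with WLB alone.
    print(relative_sensitivity(45, 60, 22, 60))  # ~2.05, i.e. roughly double the sensitivity
    ```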

    One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion

    Recent advancements in open-world 3D object generation have been remarkable, with image-to-3D methods offering superior fine-grained control over their text-to-3D counterparts. However, most existing models fall short in simultaneously providing rapid generation speeds and high fidelity to input images - two features essential for practical applications. In this paper, we present One-2-3-45++, an innovative method that transforms a single image into a detailed 3D textured mesh in approximately one minute. Our approach aims to fully harness the extensive knowledge embedded in 2D diffusion models and priors from valuable yet limited 3D data. This is achieved by initially finetuning a 2D diffusion model for consistent multi-view image generation, followed by elevating these images to 3D with the aid of multi-view conditioned 3D native diffusion models. Extensive experimental evaluations demonstrate that our method can produce high-quality, diverse 3D assets that closely mirror the original input image. Our project webpage: https://sudo-ai-3d.github.io/One2345plus_page

    Multi-site, Multi-domain Airway Tree Modeling (ATM'22): A Public Benchmark for Pulmonary Airway Segmentation

    Open international challenges are becoming the de facto standard for assessing computer vision and image analysis algorithms. In recent years, new methods have extended the reach of pulmonary airway segmentation closer to the limit of image resolution. Since the EXACT'09 pulmonary airway segmentation challenge, however, limited effort has been directed to the quantitative comparison of the newly emerged algorithms that have been driven by the maturity of deep learning based approaches and the clinical need to resolve finer details of distal airways for early intervention in pulmonary diseases. Thus far, public annotated datasets have been extremely limited, hindering the development of data-driven methods and the detailed performance evaluation of new algorithms. To provide a benchmark for the medical imaging community, we organized the Multi-site, Multi-domain Airway Tree Modeling challenge (ATM'22), held as an official challenge event during the MICCAI 2022 conference. ATM'22 provides large-scale CT scans with detailed pulmonary airway annotation: 500 CT scans (300 for training, 50 for validation, and 150 for testing). The dataset was collected from different sites and further includes a portion of noisy COVID-19 CTs with ground-glass opacity and consolidation. Twenty-three teams participated in the entire phase of the challenge, and the algorithms of the top ten teams are reviewed in this paper. Quantitative and qualitative results revealed that deep learning models incorporating topological continuity enhancement achieved superior performance in general. The ATM'22 challenge remains an open call: the training data and the gold-standard evaluation are available upon successful registration via its homepage. Comment: 32 pages, 16 figures. Homepage: https://atm22.grand-challenge.org/. Submitted
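
    The abstract does not spell out the evaluation protocol; the sketch below shows two metrics commonly used for airway segmentation (volumetric Dice and a centerline-based tree-length detected rate) as an assumption about the kind of quantitative comparison involved, not the challenge's official evaluation code.

    ```python
    # Hedged sketch of two common airway segmentation metrics; not the official
    # ATM'22 evaluation implementation.
    import numpy as np

    def dice(pred, gt):
        # pred, gt: boolean 3D volumes of the predicted and reference airway
        inter = np.logical_and(pred, gt).sum()
        return 2.0 * inter / (pred.sum() + gt.sum() + 1e-8)

    def tree_length_detected_rate(pred, gt_centerline):
        # gt_centerline: boolean 3D volume of the reference airway centerline,
        # assumed precomputed (e.g. by skeletonising the reference segmentation)
        return np.logical_and(pred, gt_centerline).sum() / (gt_centerline.sum() + 1e-8)
    ```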